Fast Encoding Method for Lossless Coding

ABSTRACT

An apparatus comprising a processor configured to receive a current block of a video frame, and determine a coding mode for the current block based on only a bit rate cost function, wherein the coding mode is selected from a plurality of available coding modes, and wherein calculation of the bit rate cost function does not consider distortion of the current block.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims priority to U.S. Provisional Patent Application No. 61/503,534 filed Jun. 30, 2011 by Wen Gao et al. and entitled “Lossless Coding Tools for Compound Video”, and U.S. Provisional Patent Application No. 61/506,958 filed Jul. 12, 2011 by Wen Gao et al. and entitled “Additional Lossless Coding Tools for Compound Video”, each of which is incorporated herein by reference as if reproduced in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

Not applicable.

REFERENCE TO A MICROFICHE APPENDIX

Not applicable.

BACKGROUND

The amount of video data needed to depict even a relatively short film can be substantial, which may result in difficulties when the data is to be streamed or otherwise communicated across a communications network with limited bandwidth capacity. Thus, video data is generally compressed prior to being communicated across modern day telecommunications networks. Video compression devices often use software and/or hardware at the source to code the video data prior to transmission, thereby decreasing the quantity of data needed to represent digital video images. The compressed data is then received at the destination by a video decompression device that decodes the video data. Due to limited network resources, improved compression and decompression techniques that increase compression ratios without substantially reducing image quality are desirable.

SUMMARY

In one embodiment, the disclosure includes an apparatus comprising a processor configured to receive a current block of a video frame, and determine a coding mode for the current block based on only a bit rate cost function, wherein the coding mode is selected from a plurality of available coding modes, and wherein calculation of the bit rate cost function does not consider distortion of the current block.

In another embodiment, the disclosure includes a method comprising receiving a current block of a video frame, and determining a coding mode for the current block based on only a bit rate cost function, wherein the coding mode is selected from a plurality of available coding modes, and wherein calculation of the bit rate cost function does not consider distortion of the current block.

In yet another embodiment, the disclosure includes an apparatus used in video coding comprising a processor configured to, for each of a plurality of pixels in a block, determine a difference with one of a plurality of corresponding pixels in a reference block, wherein each difference is based on two color values of a pair of compared pixels, and if each of the differences is within a pre-set boundary, generate information to signal the block as a skipped block, wherein the information identifies the block and the reference block, and include the information into a bitstream without further encoding of the block.

In yet another embodiment, the disclosure includes a method used in video coding comprising, for each of a plurality of pixels in a block, determining a difference with one of a plurality of corresponding pixels in a reference block, wherein each difference is based on two color values of a pair of compared pixels, and if each of the differences is within a pre-set boundary, generating information to signal the block as a skipped block, wherein the information identifies the block and the reference block, and including the information into a bitstream without further encoding of the block.

These and other features will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.

FIG. 1 is a schematic diagram of an embodiment of a transform bypass encoding scheme.

FIG. 2 is a schematic diagram of an embodiment of a transform bypass decoding scheme.

FIG. 3 is a schematic diagram of an embodiment of a transform without quantization encoding scheme.

FIG. 4 is a schematic diagram of an embodiment of a transform without quantization decoding scheme.

FIG. 5 is a schematic diagram of an embodiment of a lossy encoding scheme.

FIG. 6 is a schematic diagram of an embodiment of a lossy decoding scheme.

FIG. 7 is a flowchart of an embodiment of an encoding method.

FIG. 8 is a flowchart of an embodiment of a decoding method.

FIG. 9 is a flowchart of an embodiment of an encoding mode selection method.

FIG. 10 is a schematic diagram of an embodiment of a network unit.

FIG. 11 is a schematic diagram of a general-purpose computer system.

DETAILED DESCRIPTION

It should be understood at the outset that, although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

Typically, video media involves displaying a sequence of still images or frames in relatively quick succession, thereby causing a viewer to perceive motion. Each frame may comprise a plurality of picture elements or pixels, each of which may represent a single reference point in the frame. During digital processing, each pixel may be assigned an integer value (e.g., 0, 1, ..., or 255) that represents an image quality or characteristic, such as luminance or chrominance, at the corresponding reference point. In use, an image or video frame may comprise a large number of pixels (e.g., 2,073,600 pixels in a 1920×1080 frame), thus it may be cumbersome and inefficient to encode and decode (referred to hereinafter simply as code) each pixel independently. To improve coding efficiency, a video frame is usually broken into a plurality of rectangular blocks or macroblocks, which may serve as basic units of processing such as prediction, transform, and quantization. For example, a typical N×N block may comprise N² pixels, where N is an integer greater than one and is often a multiple of four.

In a working draft of High Efficiency Video Coding (HEVC), prepared by the International Telecommunication Union (ITU) Telecommunication Standardization Sector (ITU-T) and the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC), and poised to be the next video standard, new block concepts have been introduced. For example, coding unit (CU) may refer to a sub-partitioning of a video frame into rectangular blocks of equal or variable size. In HEVC, a CU may replace the macroblock structure of previous standards. Depending on a mode of inter or intra prediction, a CU may comprise one or more prediction units (PUs), each of which may serve as a basic unit of prediction. For example, for intra prediction, a 64×64 CU may be symmetrically split into four 32×32 PUs. For another example, for inter prediction, a 64×64 CU may be asymmetrically split into a 16×64 PU and a 48×64 PU. Similarly, a PU may comprise one or more transform units (TUs), each of which may serve as a basic unit for transform and/or quantization. For example, a 32×32 PU may be symmetrically split into four 16×16 TUs. Multiple TUs of one PU may share a same prediction mode, but may be transformed separately. Herein, the term block may generally refer to any of a macroblock, CU, PU, or TU.

Depending on the application, a block may be coded in either a lossless mode (i.e., no distortion or information loss) or a lossy mode (i.e., with distortion). In use, high quality videos (e.g., with YUV subsampling of 4:4:4) may be coded using a lossless mode, while low quality videos (e.g., with YUV subsampling of 4:2:0) may be coded using a lossy mode. Sometimes, a single video frame or slice (e.g., with YUV subsampling of either 4:4:4 or 4:2:0) may employ both lossless and lossy modes to code a plurality of regions, which may be rectangular or irregular in shape. Each region may comprise a plurality of blocks. For example, a compound video may comprise a combination of different types of contents, such as texts, computer graphics, and natural-view content (e.g., camera-captured video). In a compound frame, regions of texts and graphics may be coded in a lossless mode, while regions of natural-view content may be coded in a lossy mode. Lossless coding of texts and graphics may be desired, e.g., in computer screen sharing applications, since lossy coding may lead to poor quality or fidelity of texts and graphics, which may cause eye fatigue. Current HEVC test models (HMs), such as HM 3.0, may code natural-view content fairly efficiently. However, the current HMs may lack a lossless coding mode for certain videos, thus their coding efficiency and speed may be limited.

In lossy coding schemes of current HMs, a bit rate and distortion of a coded video may need to be balanced. To achieve low distortion, often more information (e.g., pixel values or transform coefficients) needs to be encoded, leading to more encoded bits and thus a higher bit rate. On the other hand, to achieve a smaller bit rate, certain information may need to be removed. For example, through a two-dimensional transform operation, pixel values in a spatial domain are converted to transform coefficients in a frequency domain. In a transform coefficient matrix, high-index transform coefficients (e.g., in the bottom-right corner) corresponding to small spatial features may have relatively small values. Thus, in a subsequent quantization operation, larger quantization coefficients may be applied on the high-index transform coefficients. After integer rounding, a number of zero-valued transform coefficients may be created in the high-index positions, which may then be skipped in following encoding steps. Although quantization may lower the bit rate, information for small spatial features may be lost in the coding process. The lost information may be irretrievable, thus distortion may be increased and coding fidelity lowered in the decoded video.
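As a rough illustration of why quantization is irreversible, the following minimal sketch divides each coefficient by a quantization step and rounds to an integer, then rescales. The function names, the 2×2 coefficient matrix, and the step sizes are hypothetical values chosen for illustration and are not taken from any HM.

```python
def quantize(coeffs, steps):
    """Divide each coefficient by its step and round to the nearest integer."""
    return [
        [round(c / s) for c, s in zip(coeff_row, step_row)]
        for coeff_row, step_row in zip(coeffs, steps)
    ]


def dequantize(levels, steps):
    """Rescale quantized levels; the rounding error cannot be undone."""
    return [
        [level * s for level, s in zip(level_row, step_row)]
        for level_row, step_row in zip(levels, steps)
    ]


# Hypothetical 2x2 coefficient matrix; larger steps toward the bottom-right.
coeffs = [[120, 30], [13, 3]]
steps = [[4, 8], [8, 16]]

levels = quantize(coeffs, steps)
recovered = dequantize(levels, steps)
print(levels)     # the small high-index coefficient becomes zero
print(recovered)  # differs from coeffs wherever rounding occurred
```

In this toy case the bottom-right coefficient is rounded to zero and cannot be restored by rescaling, which is the information loss the paragraph above describes.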

In use, there may be a plurality of coding modes to code a video frame. For example, a particular slice of the frame may use various block partitions (number, size, and shape). For each partition, if inter prediction is to be used, there may be various motion vectors associated with one or more reference frames. Otherwise, if intra prediction is to be used, there may be various reference pixels corresponding to various intra prediction modes. Each coding mode may lead to a different bit rate and/or distortion. Thus, a rate-distortion optimization (RDO) module in a video encoder may be configured to select a best coding mode from the plurality of coding modes to determine an optimal balance or trade-off between the bit rate and distortion.

Current HMs may jointly evaluate an overall cost of bit rate and distortion. For example, a bit rate (denoted as R) and a distortion cost (denoted as D) may be combined into a single joint rate-distortion (RD) cost (denoted as J), which may be mathematically presented as:

J=D+λR

where λ is a Lagrangian coefficient representing the relationship between a bit rate and a particular quality level.

Various mathematical metrics may be used to calculate distortion, such as a sum of squared differences (SSD), sum of absolute error (SAE), sum of absolute differences (SAD), mean of absolute difference (MAD), or mean of squared errors (MSE). Using any of these distortion metrics, the RDO process may attempt to find a coding mode that minimizes J.

In current HMs, the selection of an optimal coding mode in an encoder may be a complex process. For example, for every available coding mode (denoted as m) of every block, the encoder may code the block using mode m and calculate R, which is the number of bits required to code the block. Then, the encoder may reconstruct the block and calculate D, which is a difference between the original and reconstructed block. Then, the encoder may calculate the mode cost J_m using the equation above. This process may be repeated for every available coding mode. Then, the encoder may choose a mode that gives the minimum J_m. The RDO process in the encoder may be a computationally intensive process, since there may be potentially hundreds of possible coding modes, e.g., based on various combinations of block sizes, inter prediction frames, and intra prediction directions. Both R and D of the block may need to be calculated hundreds of times before the best coding mode may be determined.
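The per-mode loop described above can be sketched as follows. This is a simplified illustration, not HM code: the function names are hypothetical, each candidate mode is assumed to already come with its bit count R and its reconstructed block, and SSD is used as the distortion metric.

```python
def sum_squared_differences(original, reconstructed):
    """Distortion D as the sum of squared differences between two flat blocks."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed))


def select_mode_joint_rd(original, candidates, lam):
    """Return (mode, J) minimizing J = D + lam * R.

    candidates maps a mode name to (bits, reconstructed_block).
    """
    best_mode, best_cost = None, float("inf")
    for mode, (bits, reconstructed) in candidates.items():
        distortion = sum_squared_differences(original, reconstructed)
        cost = distortion + lam * bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Hypothetical block and candidate modes for illustration only.
original = [10, 12, 14, 16]
candidates = {
    "mode_a": (40, [10, 12, 14, 16]),  # many bits, no distortion
    "mode_b": (20, [10, 13, 14, 15]),  # fewer bits, slight distortion
    "mode_c": (8, [12, 12, 12, 12]),   # fewest bits, most distortion
}
print(select_mode_joint_rd(original, candidates, lam=2.0))
```

Note that every candidate requires a reconstruction and a comparison against the original block before J can be computed, which is the cost the paragraph above attributes to the RDO process.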

In addition, when a sequence of video frames is being coded, sometimes certain regions may remain stable for a relatively long period of time. For example, in video conferencing applications, a background region of each user may remain unchanged for tens of minutes. In current encoders, the RDO module may still evaluate bit rate and/or distortion for blocks in these regions, which may consume valuable computation resources and time.

Disclosed herein are systems and methods for improved video coding. The disclosure provides a lossless coding mode and a forced skip mode, which may complement a lossy coding mode in coding of a video such as a compound video. The lossless mode may include a transform bypass coding scheme and a transform without quantization coding scheme. In lossless coding of a block, since no distortion (or only slight distortion) may be induced, the RDO mode selection process may be simplified. In an embodiment, only a bit rate portion of a joint RD cost is preserved. Thus, from a plurality of available coding modes, the RDO process may only need to determine an optimal coding mode that leads to a least number of bits. A reconstructed block may not need to be compared with an original source block, which may save both computation resources and time. Furthermore, if a video frame or slice comprises one or more regions which remain stable for a relatively long period (e.g., tens of seconds or minutes), the RDO process may implement a forced skip mode in the one or more regions. In an embodiment of the forced skip mode, if a CU is found to be an exact match (or an approximate match with differences within a pre-set boundary) with a corresponding reference CU in a reference frame, the CU may be skipped in the rest of the encoding steps. Due to implementation of the simplified RDO mode selection scheme and the forced skip mode, videos may be coded both faster and more efficiently.
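A minimal sketch of the forced skip test is given below. It assumes each pixel is represented by a single integer color value (e.g., a luma sample) and that the pre-set boundary is an absolute-difference threshold; the function name and the sample blocks are hypothetical.

```python
def is_forced_skip(block, reference_block, boundary=0):
    """Return True if every pixel differs from its co-located reference pixel
    by at most `boundary` (boundary=0 means an exact match)."""
    return all(
        abs(pixel - ref_pixel) <= boundary
        for row, ref_row in zip(block, reference_block)
        for pixel, ref_pixel in zip(row, ref_row)
    )


# Hypothetical 2x2 blocks for illustration only.
reference = [[100, 101], [102, 103]]
unchanged = [[100, 101], [102, 103]]
nearly_same = [[100, 102], [101, 103]]

print(is_forced_skip(unchanged, reference))                # exact match
print(is_forced_skip(nearly_same, reference))              # fails exact match
print(is_forced_skip(nearly_same, reference, boundary=1))  # approximate match
```

When the test succeeds, only information identifying the block and its reference block would need to be written to the bitstream, so the remaining encoding steps for that block can be bypassed.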

In use, there may be a module before an encoder to analyze contents of a video frame, and identify certain regions (e.g., texts and/or graphics regions) where lossless encoding is desired. Information or instructions regarding which regions to encode in a lossless mode may be passed to the encoder. Based on the information, the encoder may encode the identified regions using the lossless mode. Alternatively, a user may manually define certain regions to be encoded using a lossless mode, and provide the encoder with information identifying these regions. Thus, a video (e.g., a compound video) may be encoded in a lossless mode and/or a lossy mode, depending on information received by the encoder. Herein, the lossless encoding mode may include transform bypass encoding and transform without quantization encoding. These two lossless encoding schemes as well as a lossy encoding scheme are described herein.

Likewise, based on information contained in a received bitstream, a video decoder may decode a video frame using a lossless mode and/or a lossy mode. The lossless decoding mode may include transform bypass decoding and transform without quantization decoding. The two lossless decoding schemes as well as a lossy decoding scheme are described herein.

FIG. 1 illustrates an embodiment of a transform bypass encoding scheme 100, which may be implemented in a video encoder. The transform bypass encoding scheme 100 may comprise a rate-distortion optimization (RDO) module 110, a prediction module 120, an entropy encoder 130, and a reconstruction module 140 arranged as shown in FIG. 1. In operation, an input video comprising a sequence of video frames (or slices) may be received by the encoder. Herein, a frame may refer to any of a predicted frame (P-frame), an intra-coded frame (I-frame), or a bi-predictive frame (B-frame). Likewise, a slice may refer to any of a P-slice, an I-slice, or a B-slice.

The RDO module 110 may be configured to make logic decisions for one or more of the other modules. In an embodiment, based on one or more previously encoded frames, the RDO module 110 may determine how a current frame (or slice) being encoded is partitioned into a plurality of CUs, and how a CU is partitioned into one or more PUs and TUs. For example, homogeneous regions of the current frame (i.e., no or slight difference from previously encoded frames) may be partitioned into relatively larger blocks, and detailed regions of the current frame (i.e., significant difference from previously encoded frames) may be partitioned into relatively smaller blocks.

In addition, the RDO module 110 may control the prediction module 120 by determining how the current frame is predicted. The current frame may be predicted via inter and/or intra prediction. Inter prediction (i.e., inter frame prediction) may exploit temporal redundancies in a sequence of frames, e.g., similarities between corresponding blocks of successive frames, to reduce the amount of data to be compressed. In inter prediction, the RDO module 110 may determine a motion vector of a block in the current frame based on a corresponding block in one or more reference frames. On the other hand, intra prediction (i.e., intra frame prediction) may exploit spatial redundancies within a single frame, e.g., similarities between adjacent blocks, to reduce the amount of data to be compressed. In intra prediction, reference pixels adjacent to a current block may be used to generate a prediction block. Intra prediction may be implemented using any of a plurality of available prediction modes or directions (e.g., 34 modes in HEVC), which may be determined by the RDO module 110. For example, the RDO module 110 may calculate a sum of absolute error (SAE) for each prediction mode, and select a prediction mode that results in the smallest SAE.

Based on logic decisions made by the RDO module 110, the prediction module 120 may utilize either one or more reference frames (inter prediction) or a plurality of reference pixels (intra prediction) to generate a prediction block, which may be an estimate of a current block. Then, the prediction block may be subtracted from the current block, thereby generating a residual block. The residual block may comprise a plurality of residual values, each of which may indicate a difference between a pixel in the current block and a corresponding pixel in the prediction block. Then, all values of the residual block may be scanned and encoded by the entropy encoder 130 into an encoded bitstream. The entropy encoder 130 may employ any entropy encoding scheme, such as context-adaptive binary arithmetic coding (CABAC) encoding, exponential Golomb encoding, or fixed length encoding, or any combination thereof. In the transform bypass encoding scheme 100, since the residual block is encoded without a transform step or a quantization step, no information loss may be induced in the encoding process.

To facilitate continuous encoding of video frames, the residual block may also be fed into the reconstruction module 140, which may generate either reference pixels for intra prediction of future blocks or reference frames for inter prediction of future frames. If desired, filtering may be performed on the reference frames/pixels before they are used for inter/intra prediction. A person skilled in the art is familiar with the functioning of the prediction module 120 and the reconstruction module 140, so these modules will not be further described. It should be noted that FIG. 1 may be a simplified illustration of a video encoder, thus it may only include a portion of the modules present in the encoder. Other modules (e.g., filter, scanner, and transmitter), although not shown in FIG. 1, may also be included to facilitate video encoding. Prior to transmission from the encoder, the encoded bitstream may be further configured to include other information, such as video resolution, frame rate, block partitioning information (sizes, coordinates), prediction modes, etc., so that the encoded sequence of video frames may be properly decoded.

FIG. 2 illustrates an embodiment of a transform bypass decoding scheme 200, which may be implemented in a video decoder. The transform bypass decoding scheme 200 may correspond to the transform bypass encoding scheme 100, and may comprise an entropy decoder 210, a prediction module 220, and a reconstruction module 230 arranged as shown in FIG. 2. In operation, an encoded bitstream containing information of a sequence of video frames may be received by the entropy decoder 210, which may decode the bitstream to an uncompressed format. The entropy decoder 210 may employ any entropy decoding scheme, such as CABAC decoding, exponential Golomb decoding, or fixed length decoding, or any combination thereof.

For a current block being decoded, a residual block may be generated after the execution of the entropy decoder 210. In addition, information containing a prediction mode of the current block may also be decoded by the entropy decoder 210. Then, based on the prediction mode, the prediction module 220 may generate a prediction block for the current block based on previously decoded blocks or frames. If the prediction mode is an inter mode, one or more previously decoded reference frames may be used to generate the prediction block. Otherwise, if the prediction mode is an intra mode, a plurality of previously decoded reference pixels in reference blocks may be used to generate the prediction block. Then, the reconstruction module 230 may combine the residual block with the prediction block to generate a reconstructed block. Additionally, to facilitate continuous decoding of video frames, the reconstructed block may be used in a reference frame to inter predict future frames. Some pixels of the reconstructed block may also serve as reference pixels for intra prediction of future blocks in the same frame.

In use, if an original block is encoded and decoded using lossless schemes, such as the transform bypass encoding scheme 100 and the transform bypass decoding scheme 200, no information loss may be induced in the entire coding process. Thus, barring distortion caused during transmission, a reconstructed block may be exactly the same as the original block. This high fidelity of coding may improve a user's experience in viewing video contents such as texts and graphics.
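The lossless round trip of the transform bypass schemes can be illustrated with a minimal sketch, assuming the entropy coding stage is itself lossless and is therefore omitted. The function names and the 2×2 sample blocks are hypothetical.

```python
def subtract_blocks(current, prediction):
    """Encoder side: residual = current block minus prediction block."""
    return [
        [c - p for c, p in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current, prediction)
    ]


def add_blocks(residual, prediction):
    """Decoder side: reconstructed block = residual plus prediction block."""
    return [
        [r + p for r, p in zip(res_row, pred_row)]
        for res_row, pred_row in zip(residual, prediction)
    ]


# Hypothetical 2x2 blocks for illustration only.
current = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]

residual = subtract_blocks(current, prediction)
reconstructed = add_blocks(residual, prediction)
print(residual)
print(reconstructed == current)
```

Because no transform or quantization is applied to the residual values, adding the same prediction block back at the decoder recovers the original pixel values exactly.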

During lossless coding of certain regions in a video frame, sometimes it may be desirable to include a transform step in the coding process. For example, for some blocks of a text region, an added transform step may generate a shorter bitstream compared to a transform bypass coding scheme. In an embodiment, an RDO module may be configured to determine whether to include the transform step. For example, a test transform may be performed to convert a residual block to a matrix of transform coefficients. If the number of bits needed to encode the transform coefficients is smaller than the number of bits needed to encode the residual values of the residual block without a transform, the transform step may be included. Otherwise, the transform step may be bypassed. FIG. 3 illustrates an embodiment of a transform without quantization encoding scheme 300, which may comprise an RDO module 310, a prediction module 320, a transform module 330, an entropy encoder 340, an inverse transform module 350, and a reconstruction module 360. Some aspects of the transform without quantization encoding scheme 300 may be the same or similar to the transform bypass encoding scheme 100 in FIG. 1, thus the similar aspects will not be further described in the interest of clarity.
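The test-transform decision can be sketched as below. Two stand-ins are assumed for illustration and are not prescribed by the disclosure: a 4-point Walsh-Hadamard transform takes the place of the codec's transform, and the bit count is estimated as the length of a signed exponential Golomb codeword per value. All function names are hypothetical.

```python
# 4-point Walsh-Hadamard matrix, used only as a simple stand-in transform.
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def hadamard4(values):
    """Apply the 4-point Walsh-Hadamard transform to four integers."""
    return [sum(h * v for h, v in zip(row, values)) for row in H4]


def signed_exp_golomb_bits(value):
    """Length in bits of the order-0 signed exponential Golomb code of value."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return 2 * (code_num + 1).bit_length() - 1


def count_bits(values):
    return sum(signed_exp_golomb_bits(v) for v in values)


def choose_transform_step(residual):
    """Return (use_transform, residual_bits, coefficient_bits)."""
    coefficients = hadamard4(residual)
    residual_bits = count_bits(residual)
    coefficient_bits = count_bits(coefficients)
    return coefficient_bits < residual_bits, residual_bits, coefficient_bits


print(choose_transform_step([5, 5, 5, 5]))  # flat residual: transform helps
print(choose_transform_step([9, 0, 0, 0]))  # isolated value: bypass is cheaper
```

A flat residual concentrates into one coefficient and becomes cheaper to code after the transform, whereas a single isolated residual value spreads across all coefficients, so bypassing the transform uses fewer bits in that case.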

The transform without quantization encoding scheme 300 may be implemented in a video encoder, which may receive an input video comprising a sequence of video frames. The RDO module 310 may be configured to control one or more of the other modules, and may be the same or similar to the RDO module 110 in FIG. 1. Based on logic decisions made by the RDO module 310, the prediction module 320 may utilize either reference frames (inter prediction) or reference pixels (intra prediction) to generate a prediction block, which is an estimate of a current block. Then, the prediction block may be subtracted from the current block, thereby generating a residual block. The prediction module 320 may be the same or similar to the prediction module 120 in FIG. 1.

Instead of being entropy encoded directly, the residual block in the transform without quantization encoding scheme 300 may be first transformed from a spatial domain to a frequency domain by the transform module 330. The transform module 330 may convert the values of the residual block (i.e., residual values) to a transform matrix comprising a plurality of transform coefficients. The transform module 330 may be implemented using any appropriate algorithm, such as a discrete cosine transform (DCT), a fractal transform (FT), or a discrete wavelet transform (DWT). In use, some algorithms, such as a 4×4 integer transform defined in H.264/advanced video coding (AVC), may not induce any information loss, while other algorithms, such as an 8×8 integer DCT transform defined in the HEVC working draft, may induce slight information loss. For example, since the 8×8 integer DCT transform in HEVC may not be fully reversible, recovered values of the residual block after the inverse transform module 350 may be slightly different (e.g., up to ±2 values) from the original values of the residual block before the transform module 330. When slight information loss is induced, the encoding may be near lossless instead of lossless. However, compared with a quantization step, the information loss caused by the transform step may be insignificant or unnoticeable, thus the transform without quantization encoding scheme 300 may also be included herein as part of a lossless coding scheme.

Transform coefficients generated by the transform module 330 may be scanned and encoded by the entropy encoder 340 into an encoded bitstream. The entropy encoder 340 may be the same or similar to the entropy encoder 130. To facilitate continuous encoding of video frames, the transform coefficients may also be fed into the inverse transform module 350, which may perform the inverse of the transform module 330 and generate an exact version (i.e., lossless) or an approximation (i.e., near lossless) of the residual block. Then, the residual block may be fed into the reconstruction module 360, which may generate either reference pixels for intra prediction of future blocks or reference frames for inter prediction of future frames. The reconstruction module 360 may be the same or similar to the reconstruction module 140 in FIG. 1. Prior to transmission from the encoder, the encoded bitstream may include other information, such as video resolution, frame rate, block partitioning information (sizes, coordinates), prediction modes, etc., so that the encoded sequence of video frames may be properly decoded.

FIG. 4 illustrates an embodiment of a transform without quantization decoding scheme 400, which may be implemented in a video decoder. The transform without quantization decoding scheme 400 may correspond to the transform without quantization encoding scheme 300, and may comprise an entropy decoder 410, an inverse transform module 420, a prediction module 430, and a reconstruction module 440 arranged as shown in FIG. 4. In operation, an encoded bitstream containing information of a sequence of video frames may be received by the entropy decoder 410, which may decode the bitstream to an uncompressed format. The entropy decoder 410 may be the same or similar to the entropy decoder 210 in FIG. 2.

After execution of the entropy decoder 410, a matrix of transform coefficients may be generated, which may then be fed into the inverse transform module 420. The inverse transform module 420 may convert the transform coefficients in a frequency domain to residual pixel values in a spatial domain. In use, depending on whether an algorithm used by the inverse transform module 420 is fully reversible, an exact version (i.e., lossless) or an approximation (i.e., near lossless) of the residual block may be generated. The inverse transform module 420 may be the same or similar to the inverse transform module 350 in FIG. 3.

In addition, information containing a prediction mode of the current block may also be decoded by the entropy decoder 410. Based on the prediction mode, the prediction module 430 may generate a prediction block for the current block. The prediction module 430 may be the same or similar to the prediction module 220 in FIG. 2. Then, the reconstruction module 440 may combine the residual block with the prediction block to generate a reconstructed block. Additionally, to facilitate continuous decoding of video frames, the reconstructed block may be used in a reference frame to inter predict future frames. Some pixels of the reconstructed block may also serve as reference pixels for intra prediction of future blocks in the same frame.

In use, if an original block is encoded and decoded using near lossless schemes, such as the transform without quantization encoding scheme 300 and the transform without quantization decoding scheme 400, only slight distortion may be induced in the coding process. Thus, barring significant distortion caused during transmission, a reconstructed block may be almost the same as the original block. Transform without quantization coding schemes may be desired sometimes, as they may achieve a higher compression ratio than the transform bypass schemes, without noticeable sacrifice of coding fidelity.

As mentioned previously, in current encoders an RDO module may select an optimal coding mode based on a joint RD cost. By contrast, in either a transform bypass lossless coding scheme or a transform without quantization coding scheme disclosed herein, a quantization step may be bypassed. Without information loss induced by quantization, the distortion of an original current block due to encoding may be, if any, negligible. Thus, an RDO module (e.g., the RDO module 110 in FIG. 1 or the RDO module 310 in FIG. 3) may exclude distortion from consideration in its selection of a best coding mode. With removal of the distortion factor, only a bit rate portion of a joint RD cost function may be preserved, thus the RD cost may be referred to as a bit rate cost. In an embodiment, the bit rate cost may be mathematically expressed as:

J=λR

Based on the disclosed bit rate cost function, the RDO module may test a subset or all of a plurality of available coding modes for a current block. Tested coding modes may vary in block size, motion vector, inter prediction reference frame, intra prediction mode, or reference pixels, or any combination thereof. For each tested coding mode, a number of bits may be calculated for a coded residual block of the current block or a coded matrix of transform coefficients for the current block. After comparing all resulting bit counts, the RDO module may select a coding mode that results in a least number of bits.
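A minimal sketch of the bit-rate-only selection follows. It assumes the bit count R of each tested mode has already been obtained from the entropy coder; the function name, the mode names, and the bit counts are hypothetical.

```python
def select_mode_bit_rate_only(bits_per_mode, lam=1.0):
    """Return (mode, J) minimizing the bit rate cost J = lam * R.

    bits_per_mode maps a mode name to the number of bits R it needs.
    No reconstruction or distortion computation is involved.
    """
    costs = {mode: lam * bits for mode, bits in bits_per_mode.items()}
    best_mode = min(costs, key=costs.get)
    return best_mode, costs[best_mode]


# Hypothetical bit counts for three tested modes.
bits_per_mode = {"intra_dc": 96, "intra_planar": 88, "inter_ref0": 72}
print(select_mode_bit_rate_only(bits_per_mode))
```

For any positive λ the mode that minimizes J = λR is simply the mode with the fewest bits, so in this sketch λ only scales the reported cost and does not change which mode is selected.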

In comparison with current encoders which calculate both D and R in determining the optimal coding mode, the disclosed coding mode selection scheme may be relatively simpler. For example, with removal of the D portion, a reconstructed block may no longer need to be compared with its original block. Thus, several calculation steps may be removed from the evaluation process in each coding mode, which may save coding time and computation resources. Considering there may be potentially hundreds of coding modes for the current block in the evaluation, the savings may be significant and encoding may be made faster, which may greatly facilitate a real-time encoding process.

Sometimes it may be unnecessary to code an entire video frame using a lossless mode. For example, regions containing natural-view contents (e.g., captured by a low resolution camera) in a compound video may not require lossless coding, because the original video quality may already be limited, or because distortion due to lossy coding may not be significant. FIG. 5 illustrates an embodiment of a lossy encoding scheme 500, which may be the same as or similar to encoding schemes used in current HMs. The lossy encoding scheme 500 may comprise a RDO module 510, a prediction module 520, a transform module 530, a quantization module 540, an entropy encoder 550, a de-quantization module 560, an inverse transform module 570, and a reconstruction module 580. Some aspects of the lossy encoding scheme 500 may be the same as or similar to the transform without quantization encoding scheme 300 in FIG. 3, thus the similar aspects will not be further described in the interest of clarity.

The lossy encoding scheme 500 may be implemented in a video encoder, which may receive a sequence of video frames. The RDO module 510 may be configured to control one or more of other modules. Based on logic decisions made by the RDO module 510, the prediction module 520 may utilize either reference frames or reference pixels to generate a prediction block. Then, the prediction block may be subtracted from a current block of the input video to generate a residual block. The residual block may be fed into the transform module 530, which may convert residual pixel values into a matrix of transform coefficients.

In contrast to the transform without quantization encoding scheme 300, in the lossy encoding scheme 500, the transform coefficients may be quantized by the quantization module 540 before being fed into the entropy encoder 550. The quantization module 540 may alter the scale of the transform coefficients and round them to integers, which may reduce the number of non-zero coefficients. Consequently, a compression ratio may be increased at a cost of information loss.
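The effect of scaling and rounding can be illustrated with a small sketch. The uniform step size, the helper names (`quantize`, `dequantize`, `count_nonzero`), and the coefficient values are assumptions chosen for illustration; an actual quantization module would use the scaling and rounding rules defined by the applicable codec.

```python
def quantize(coeffs, step):
    """Divide each transform coefficient by `step` and round to an integer.

    Note: Python's round() rounds exact halves to the nearest even integer;
    real codecs define their own rounding offsets.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Recover the original scale (but not the discarded precision)."""
    return [[level * step for level in row] for row in levels]


def count_nonzero(matrix):
    return sum(1 for row in matrix for value in row if value != 0)


# Hypothetical 4x4 matrix of transform coefficients.
coeffs = [[50, -7, 3, 0], [-6, 5, 0, 1], [2, 0, 0, 0], [1, 0, 0, 0]]
levels = quantize(coeffs, step=8)
restored = dequantize(levels, step=8)
```

In this example the number of non-zero entries drops from eight to four, and `restored` differs from `coeffs`, which reflects the information loss traded for a higher compression ratio.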

Quantized transform coefficients generated by the quantization module 540 may be scanned. Non-zero-valued coefficients may be encoded by the entropy encoder 550 into an encoded bitstream. The quantized transform coefficients may also be fed into the de-quantization module 560 to recover the original scale of the transform coefficients. Then, the inverse transform module 570 may perform the inverse of the transform module 530 and generate a noisy version of the original residual block. Then, the lossy residual block may be fed into the reconstruction module 580, which may generate either reference pixels for intra prediction of future blocks or reference frames for inter prediction of future frames.

FIG. 6 illustrates an embodiment of a lossy decoding scheme 600, which may be implemented in a video decoder. The lossy decoding scheme 600 may correspond to the lossy encoding scheme 500, and may comprise an entropy decoder 610, a de-quantization module 620, an inverse transform module 630, a prediction module 640, and a reconstruction module 650 arranged as shown in FIG. 6. In operation, an encoded bitstream containing information of a sequence of video frames may be received by the entropy decoder 610, which may decode the bitstream to an uncompressed format. A matrix of quantized transform coefficients may be generated, which may then be fed into the de-quantization module 620, which may be the same as or similar to the de-quantization module 560 in FIG. 5. Then, output of the de-quantization module 620 may be fed into the inverse transform module 630, which may convert transform coefficients to residual values of a residual block. In addition, information containing a prediction mode of the current block may also be decoded by the entropy decoder 610. Based on the prediction mode, the prediction module 640 may generate a prediction block for the current block. Then, the reconstruction module 650 may combine the residual block with the prediction block to generate a reconstructed block. Additionally, to facilitate continuous decoding, the reconstructed block may be used in a reference frame to inter predict future frames. Some pixels of the reconstructed block may also serve as reference pixels for intra prediction of future blocks in the same frame.

In an embodiment, if desired, all of the aforementioned encoding schemes, including the transform bypass encoding scheme 100, the transform without quantization encoding scheme 300, and the lossy encoding scheme 500, may be implemented in a single encoder. For example, when encoding a compound video, the encoder may receive information regarding which regions should be encoded in a lossless mode and/or which regions should be encoded in a lossy mode. Based on the information, the encoder may encode certain regions using a lossy mode and other regions using a lossless mode. In the lossless mode, a RDO module (e.g., the RDO module 110 in FIG. 1) of the encoder may determine whether to bypass a transform step, after comparing bitstream lengths resulting from the transform bypass encoding scheme 100 and the transform without quantization encoding scheme 300. Similarly, if desired, all of the aforementioned decoding schemes, including the transform bypass decoding scheme 200, the transform without quantization decoding scheme 400, and the lossy decoding scheme 600, may be implemented in a single decoder.

For a decoder to properly reconstruct an encoded video frame, it should recognize one or more encoding schemes that have been used to encode the video frame. Since lossless encoding may be applied only on some regions of the video frame (referred to hereinafter as lossless encoding regions), lossy encoding may be applied on the other regions (referred to hereinafter as lossy or regular encoding regions). Information signaling lossless encoding regions and/or lossy encoding regions may be conveyed in a bitstream that carries the encoded video frame. In use, such information may be packed in a high level syntax structure, such as a sequence parameter set (SPS) or a picture parameter set (PPS) of the bitstream. A SPS or PPS may be a key normative part of the bitstream, and may be defined by a video coding standard. After receiving the bitstream, the decoder may extract region indication information from the SPS or PPS, and then reconstruct each region according to its encoding mode. In an embodiment, the SPS or PPS may include a number of rectangular lossless encoding regions as well as information identifying their positions in the video frame (e.g., top-left and bottom-right coordinates, or top-right and bottom-left coordinates). In another embodiment, the SPS or PPS may include a number of rectangular lossy encoding regions as well as information identifying their positions in the video frame (e.g., top-left and bottom-right coordinates, or top-right and bottom-left coordinates).
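As a rough sketch of how region indication information might be used once extracted, the following example represents each rectangular lossless encoding region by its top-left and bottom-right coordinates and looks up the mode for a given pixel position. The `Rect` type, the `region_mode` function, the inclusive treatment of the bottom-right corner, and the coordinate values are all assumptions made for illustration; the actual syntax would be defined by the SPS or PPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical rectangle given by top-left and bottom-right corners."""
    left: int
    top: int
    right: int   # assumed inclusive
    bottom: int  # assumed inclusive

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def region_mode(x, y, lossless_regions):
    """Return 'lossless' if (x, y) lies in any signaled rectangle, else 'lossy'."""
    for rect in lossless_regions:
        if rect.contains(x, y):
            return "lossless"
    return "lossy"


# Illustrative regions in a 1280x720 frame (values are not from the disclosure).
regions = [Rect(0, 0, 639, 359), Rect(800, 400, 1279, 719)]
mode_a = region_mode(10, 10, regions)
mode_b = region_mode(700, 100, regions)
```

Signaling lossy regions instead of lossless regions would only invert the default returned by the lookup.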

In some applications, such as sharing a screen during a video conference, certain regions of a video may remain stable across a plurality of video frames. In this case, region indication information may only change at a relatively low frequency (e.g., once in tens of seconds), thus bitrate overhead caused by this signaling method may be negligible.

Within a lossless encoding region, a transform bypass scheme and/or a transform without quantization scheme may be used. To allow proper decoding, a bitstream may also contain information regarding which blocks have been encoded via the transform bypass scheme and which blocks via the transform without quantization scheme. In an embodiment, two transform bypass flags may be introduced for each PU in the lossless encoding region. A luminance (luma) transform bypass flag may indicate whether a transform step is bypassed (or skipped) in the coding of luma pixels of a PU, and a chrominance (chroma) transform bypass flag may indicate whether a transform step is bypassed in the coding of chroma pixels of the PU. For example, if a transform module (e.g., the transform module 330 in FIG. 3) is bypassed for the luma pixels, the luma transform bypass flag may be set to ‘1’. Otherwise, if the transform module is used and a quantization module (e.g., the quantization module 540) is bypassed, the luma transform bypass flag may be set to ‘0’. Alternatively, if desired, the luma transform bypass flag may be set to ‘0’ if the transform module is bypassed, and ‘1’ if the transform module is used. The chroma transform bypass flag may be set using an approach that is the same as or similar to that used for the luma transform bypass flag.

Both the luma and chroma transform bypass flags may be encoded by an entropy encoder (e.g., the entropy encoder 130 in FIG. 1). The entropy encoder may use a CABAC algorithm, which may use a plurality of context models. In an embodiment, three context models may be used for each of the luma and chroma transform bypass flags. To improve coding efficiency, the entropy encoder may select a context model based on an index, which may be correlated to transform bypass flags of adjacent PUs. Consider, for example, the coding of a luma transform bypass flag for a current PU, with the assumption that a chroma transform bypass flag for the current PU may be coded in a same or similar way. Two adjacent PUs, namely an upper PU and a left PU, may also have luma transform bypass flags. A sum of the two luma transform bypass flags may be configured to be the index of the context models. If either the upper PU or the left PU does not have a luma transform bypass flag (e.g., the current PU is on a boundary of a lossless encoding region), ‘0’ may be assigned to the missing luma transform bypass flag. After entropy encoding using the selected context model, the encoded luma and chroma transform bypass flags may be included in the bitstream.
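The context model selection described above may be sketched as follows. The function name `context_index` and the use of `None` to mark a neighboring PU without a flag are assumptions for illustration; the CABAC engine itself is not modeled.

```python
def context_index(upper_flag, left_flag):
    """Return the context model index for a transform bypass flag.

    The index is the sum of the upper and left neighbors' flags (each 0 or 1),
    so it takes one of three values: 0, 1, or 2. A neighbor without a flag
    (represented here as None) is treated as having a flag of 0.
    """
    upper = 0 if upper_flag is None else upper_flag
    left = 0 if left_flag is None else left_flag
    return upper + left


# Current PU on the top boundary of a lossless region: no upper neighbor flag.
idx_boundary = context_index(None, 1)
# Both neighbors bypassed the transform.
idx_both = context_index(1, 1)
```

The three possible sums correspond to the three context models mentioned for each of the luma and chroma transform bypass flags.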

In an embodiment, the luma and chroma components of a PU may share a same lossless coding scheme, and both components may bypass or include a transform step in their coding process. In this case, a single transform bypass flag may be used for both components. Compared with separate transform bypass flags for the luma and chroma components, the single transform bypass flag may lead to less signaling overhead in the bitstream. Moreover, it should be noted that, although transform bypass flags (luma and/or chroma) are set on the PU level in the descriptions above, if desired, the transform bypass flags may also be similarly set on a TU level, which may result in finer granularity but more signaling overhead.

FIG. 7 is a flowchart of an embodiment of an encoding method 700, which may implement some or all of the aforementioned encoding schemes in a video encoder. The method 700 may start in step 702, where an input video comprising a sequence of video frames or slices may be received. For each frame or a set of frames, information or instructions indicating one or more lossless encoding regions and/or lossy encoding regions may also be received. Next, in step 703, region indication information may be added to a high level syntax of the compressed bitstream, which may identify these lossless encoding regions and/or lossy encoding regions. The syntax may be included in the SPS or PPS of a bitstream. In an embodiment, the region indication information may include a number of rectangular lossless encoding regions and their positions in the video frame (e.g., top-left and bottom-right coordinates, or top-right and bottom-left coordinates). In another embodiment, the region indication information may include a number of rectangular lossy encoding regions and their positions in the video frame (e.g., top-left and bottom-right coordinates, or top-right and bottom-left coordinates).

Next, in step 704, based on received information, the method 700 may determine if a region (e.g., rectangular) currently being encoded is a lossless encoding region. If the condition in the block 704 is met, the method 700 may proceed to step 706 to encode the current region in a lossless mode (e.g., using the transform bypass encoding scheme 100 and/or the transform without quantization encoding scheme 300). Otherwise, the method 700 may proceed to step 730 to encode the current region in a lossy mode (e.g., using the lossy encoding scheme 500).

Next, in step 706, a residual block may be generated for each block of the current region. To generate the residual block, a RDO module (e.g., the RDO module 110 in FIG. 1) may make logic decisions, such as selecting a best block partitioning scheme for the current region, as well as determining a best inter or intra prediction mode for a current block (e.g., a PU). Based on logic decisions of the RDO module, a prediction module (e.g., the prediction module 120) may generate a prediction block, which may then be subtracted from the current block to obtain the residual block.

Next, in step 708, the method 700 may determine if a transform step should be bypassed for luma and/or chroma components of the current block, which may be implemented through the RDO module. If the condition in the block 708 is met, the method 700 may proceed to step 710, where one or more transform bypass flags for the current block may be set to ‘1’. Otherwise, the method 700 may proceed to step 720, where the one or more transform bypass flags may be set to ‘0’. The binary value may be arbitrarily set. For example, if desired, the one or more transform bypass flags may be set to ‘0’ in step 710 and ‘1’ in step 720. In use, luma and chroma components may use separate transform bypass flags. If the two components always use a same encoding scheme, they may also share a transform bypass flag.

Step 710 may be followed by step 712, where the residual block may be encoded using an entropy encoder (e.g., the entropy encoder 130 in FIG. 1) into a compressed bitstream. The entropy encoder may use any suitable algorithm, such as a CABAC algorithm. In addition, the one or more ‘1’ transform bypass flags may be encoded by the entropy encoder. In an embodiment, three context models may be used for each of the luma and chroma components.

Step 720 may be followed by step 722, where the residual block may be converted in a transform module (e.g., the transform module 330 in FIG. 3) into a two-dimensional matrix of transform coefficients. The transform module may use any suitable transform, such as an integer DCT transform or an integer DCT-like transform. Next, in step 724, the transform coefficients may be encoded using an entropy encoder (e.g., the entropy encoder 340 in FIG. 3) into a compressed bitstream. In addition, the one or more ‘0’ transform bypass flags may be encoded by the entropy encoder.

If a lossy encoding mode is chosen for the current region in step 704, the method 700 may proceed to step 730, where a residual block may be generated for each block of the current region. To generate the residual block, a RDO module (e.g., the RDO module 510 in FIG. 5) may select a block partitioning scheme for the current region and an inter or intra prediction mode for a current block (e.g., a PU). Based on logic decisions of the RDO module, a prediction module (e.g., the prediction module 520) may generate a prediction block, which may then be subtracted from the current block to obtain the residual block. Next, in step 732, the residual block may be converted in a transform module (e.g., the transform module 530) into a matrix of transform coefficients. Next, in step 734, the matrix may be quantized in a quantization module (e.g., the quantization module 540) into another matrix of quantized transform coefficients. Next, in step 736, the quantized transform coefficients may be encoded using an entropy encoder (e.g., the entropy encoder 550) into the bitstream which may already have the region indication information.

Each block of the current region may be encoded using some of steps 702-736. In an embodiment, after encoding all blocks in the current region, in step 740, the bitstream may be transmitted, for example, over a network to a decoder. It should be understood that the method 700 may only include a portion of all necessary encoding steps, thus other steps, such as de-quantization and inverse transform, may also be incorporated into the encoding process wherever necessary.

FIG. 8 is a flowchart of an embodiment of a decoding method 800, which may correspond to the encoding method 700 and may implement some or all of the aforementioned decoding schemes in a video decoder. The method 800 may start in step 802, where a bitstream comprising a sequence of video frames may be received. Next, in step 804, a high level syntax (e.g., SPS or PPS) of the bitstream may be checked for region indication information, which may signal which regions in a frame or a set of frames have been encoded in a lossless mode. Next, in step 806, based on the region indication information, the method 800 may determine if a region (e.g., rectangular) currently being decoded has been encoded in a lossless mode. If the condition in the block 806 is met, the method 800 may proceed to step 808 to decode the current region in a lossless mode (e.g., using the transform bypass decoding scheme 200 and/or the transform without quantization decoding scheme 400). Otherwise, the method 800 may proceed to step 830 to decode the current region in a lossy mode (e.g., using the lossy decoding scheme 600).

For each block of the current region, in step 808, one or more encoded transform bypass flags may be decoded in an entropy decoder (e.g., the entropy decoder 210 in FIG. 2), which may perform the inverse of an entropy encoder. If luma and chroma components of a current block use separate transform bypass flags, two flags may be decoded for the current block. Alternatively, if the luma and chroma components share a transform bypass flag, one flag may be decoded. Next, in step 810, the method 800 may determine if the transform bypass flag is ‘1’. As mentioned above, a transform bypass flag of ‘1’ may indicate that a transform step has been bypassed in the encoding process of the current block, and a transform bypass flag of ‘0’ may indicate that a transform step has been used without quantization. It should be understood that the binary value here may be interpreted based on a corresponding encoding method (e.g., the method 700). For example, if the method 700 reverses the meaning of ‘1’ and ‘0’, the method 800 may also be adjusted accordingly. If the condition in the block 810 is met, the method 800 may proceed to step 812, where a residual block of the current block may be decoded using the entropy decoder into an uncompressed format. Otherwise, the method 800 may proceed to step 820, where a matrix of transform coefficients may be decoded using the entropy decoder. Step 820 may be followed by step 822, where the transform coefficients may be converted to a residual block of the current block using an inverse transform module (e.g., the inverse transform module 420 in FIG. 4).

If the current region needs to be decoded in a lossy decoding mode (determined by block 806), the method 800 may proceed to step 830, where a matrix of quantized transform coefficients may be decoded in an entropy decoder (e.g., the entropy decoder 610 in FIG. 6). Next, in step 832, the quantized transform coefficients may be de-quantized to recover an original scale of the transform coefficients. Next, in step 834, the transform coefficients may be inverse transformed to a residual block of the current block.

After obtaining the residual block using either a lossless or lossy decoding mode, in step 840, a prediction block may be generated. The prediction block may be based on information (decoded from the bitstream using the entropy decoder) comprising a prediction mode, as well as one or more previously coded frames or blocks. Next, in step 842, the residual block may be added to the prediction block, thus generating a reconstructed block. Depending on the encoding and decoding schemes used, the reconstructed block may be an exact, approximate, or noisy version of the original block (before encoding). Barring distortion introduced during transmission, all information from the original block may be preserved in transform bypass coding. Depending on properties of transform and inverse transform, all (or nearly all) information may be preserved in transform without quantization coding. Certain information may be lost in lossy coding, and the degree of loss may mostly depend on the quantization and de-quantization steps. To facilitate continuous decoding of blocks, some pixels of the reconstructed block may also serve as reference pixels for decoding of future blocks. Likewise, the current frame may also serve as a reference frame for decoding of future frames.
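The lossless branch of the decoding flow (interpreting the transform bypass flag, recovering the residual block, and adding it to the prediction block) may be sketched as below. This is an illustration under stated assumptions only: `reconstruct_block` is a hypothetical helper, entropy decoding is assumed to have already produced `payload`, and the toy transform pair is a simple row-wise difference used as a stand-in, not the integer DCT or DCT-like transform an actual codec would use.

```python
from itertools import accumulate


def toy_forward(residual):
    """Stand-in invertible transform: keep the first value, then row differences."""
    return [[row[0]] + [b - a for a, b in zip(row, row[1:])] for row in residual]


def toy_inverse(coeffs):
    """Inverse of toy_forward: running sums along each row."""
    return [list(accumulate(row)) for row in coeffs]


def reconstruct_block(prediction, payload, bypass_flag, inverse_transform):
    """Rebuild a block in a lossless region.

    bypass_flag == 1: payload is the residual block itself (transform bypassed).
    bypass_flag == 0: payload holds transform coefficients (no quantization).
    """
    residual = payload if bypass_flag == 1 else inverse_transform(payload)
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]


# Hypothetical 2x3 block values.
prediction = [[100, 100, 100], [100, 100, 100]]
residual = [[3, 4, 4], [-1, 0, 2]]
via_bypass = reconstruct_block(prediction, residual, 1, toy_inverse)
via_transform = reconstruct_block(prediction, toy_forward(residual), 0, toy_inverse)
```

With an exactly invertible transform, both paths yield the same reconstructed block, which matches the statement that all (or nearly all) information may be preserved when quantization is skipped.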

As mentioned previously, when a sequence of video frames is being coded, sometimes certain regions may remain stable for a relatively long period of time. For example, in video conferencing applications, a background region of each user may remain unchanged for tens of minutes. For another example, in computer screen sharing applications (e.g., used in online video gaming), one or more regions containing text and/or graphics may remain unchanged for tens of seconds or minutes. Since continuous coding of these stable regions may consume unnecessary computation resources and time, it may be desirable to skip these regions from the coding process.

In use, a RDO module (e.g., the RDO module 110 in FIG. 1, the RDO module 310 in FIG. 3, or the RDO module 510 in FIG. 5) may initiate a forced skip mode (also referred to hereafter as a skip mode). Consider, for example, a CU currently being encoded in a P-slice. It should be noted that any other type of block (e.g., macroblock or PU) and any other type of slice or frame (e.g., B-slice, I-slice, P-frame, B-frame, I-frame) may be coded using a same or similar skip mode. The RDO module may select an optimal coding mode for the current CU in the P-slice. In an embodiment, before generating any residual value, the current CU may be first compared with one or more corresponding CUs (referred to hereafter as reference CUs) positioned at a same position in one or more reference slices. The reference slices of the P-slice may be of any type. If an exact match is found between all corresponding pixels of the current CU and a reference CU, a forced skip mode may be determined as the optimal coding mode for the current CU. Alternatively, in an embodiment, if differences between all corresponding pixels of the current CU and the reference CU are found to be within a small pre-set boundary (e.g., ±1), the forced skip mode may also be determined as the optimal coding mode for the current CU.
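The comparison against a co-located reference CU may be sketched as follows. The function name `is_forced_skip`, the list-of-rows block representation, and the sample values are assumptions for illustration; a tolerance of 0 corresponds to the exact match case and a tolerance of 1 to the ±1 example.

```python
def is_forced_skip(current, reference, tolerance=0):
    """Return True if every co-located pixel pair differs by at most `tolerance`."""
    same_shape = len(current) == len(reference) and all(
        len(c_row) == len(r_row) for c_row, r_row in zip(current, reference)
    )
    if not same_shape:
        raise ValueError("blocks must have the same dimensions")
    return all(
        abs(c - r) <= tolerance
        for c_row, r_row in zip(current, reference)
        for c, r in zip(c_row, r_row)
    )


# Hypothetical 2x2 blocks of pixel values.
current_cu = [[10, 20], [30, 40]]
near_reference = [[11, 20], [30, 39]]
exact = is_forced_skip(current_cu, near_reference)                  # exact match required
within_one = is_forced_skip(current_cu, near_reference, tolerance=1)
```

Because `all` stops at the first pixel pair that exceeds the tolerance, blocks that clearly differ are rejected early and fall through to the regular mode decision.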

In the forced skip mode, the RDO module may skip the rest of the RDO and coding steps for the current CU, which may improve encoding speed. For example, the RDO module may skip a RDO process where RD or bit rate costs are calculated in various coding modes (e.g., various inter/intra prediction modes and/or PU/TU partitions). Instead, the current CU may be flagged or signaled as a skipped CU. Information identifying the skipped CU and its matching reference CU may be included in a bitstream. In an embodiment, for each of the skipped CU and its matching reference CU, the signaling information may comprise a size and/or a plurality of coordinates (e.g., top-left and bottom-right coordinates, or top-right and bottom-left coordinates). No residual value or transform coefficient of the skipped CU may be needed in the bitstream.

Upon receiving the bitstream, a video decoder may check to see if a current CU has been encoded in a forced skip mode based on signaling information contained in the bitstream. If yes, then pixel values of the matching reference CU may be used to reconstruct the current CU. Since there may be potentially a large number of CUs that may be coded in the forced skip mode, the bit rate of coding these CUs may be significantly reduced. Further, the coding process may be made faster, and computation resources may be saved accordingly.

FIG. 9 illustrates an embodiment of an encoding mode selection method 900. The method 900 may be complementary to the encoding method 700 in FIG. 7, thus, if desired, both methods may be implemented in a same encoder. The method 900 may start in step 910, where a current block (e.g., a CU or macroblock) may be compared with one or more corresponding reference blocks. The corresponding reference blocks may be located in a reference frame or slice of any type. In an embodiment, luma and/or chroma components of each pixel within the current block may be compared with luma and/or chroma components of each corresponding pixel (located at a same position) within the one or more reference blocks. A difference of pixel value may be generated for each pair of compared pixels.

Next, in step 920, the method 900 may determine if all differences are within a pre-set boundary or tolerance or range (e.g., ±1). If the condition in the block 920 is met, the method 900 may proceed to step 930. Otherwise, the method 900 may proceed to step 940. In step 930, information may be included in the bitstream to signal that the current block is encoded in a forced skip mode. The information may identify the skipped CU and its matching reference CU. In an embodiment, for each of the skipped CU and its matching reference CU, the signaling information may comprise a size and/or a plurality of coordinates (e.g., top-left and bottom-right coordinates, or top-right and bottom-left coordinates). The rest of the encoding steps (e.g., RDO mode selection, encoding of residual block) may be skipped for the current block.

In step 940, the method 900 may determine if the current block is located within a lossless encoding region. If the condition in the block 940 is met, the method 900 may proceed to step 950. Otherwise, the method 900 may proceed to step 970. In step 950, an encoding mode leading to a least number of bits may be selected as an optimal mode. The optimal mode may be determined by a RDO module (e.g., the RDO module 110 in FIG. 1 or the RDO module 310 in FIG. 3), which may test a plurality of combinations of various block sizes, motion vectors, inter prediction reference frames, intra prediction modes, and/or reference pixels. Since no distortion or only slight distortion may be induced in the lossless mode, the RDO module may exclude the distortion portion of a RD cost function in determining the optimal coding mode. Next, in step 960, the current block may be encoded in a lossless mode using a transform bypass lossless coding scheme and/or a transform without quantization coding scheme.

In step 970, an encoding mode leading to a smallest RD cost may be selected as an optimal mode. The RD cost of different encoding modes may take into account both the bit rate portion and the distortion portion in determining the optimal coding mode. Next, in step 980, the current block may be encoded in a lossy mode using a lossy encoding scheme. It should be understood that the method 900 may only include a portion of all necessary encoding steps, thus other steps, such as transform, quantization, de-quantization, inverse transform, and transmission, may also be incorporated into the encoding process wherever appropriate.
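The overall decision flow of the method 900 may be summarized in a short sketch. Everything here is illustrative: `choose_encoding_path` is a hypothetical function, the candidate table of (bits, distortion) pairs is made up, and the joint cost is assumed to take the conventional form J = D + λR, of which the bit rate cost J = λR is the distortion-free special case.

```python
def choose_encoding_path(max_abs_diff, in_lossless_region, candidates,
                         lam=1.0, tolerance=1):
    """Return (path, mode) following the order of checks in the method 900.

    max_abs_diff: largest absolute pixel difference against the reference block.
    candidates: dict mapping a mode label to a (bits, distortion) pair.
    """
    # Steps 920/930: forced skip when all differences are within the tolerance.
    if max_abs_diff <= tolerance:
        return ("forced_skip", None)
    # Steps 940/950/960: lossless region, compare bit counts only.
    if in_lossless_region:
        mode = min(candidates, key=lambda m: candidates[m][0])
        return ("lossless", mode)
    # Steps 970/980: lossy region, compare the assumed joint cost D + lam * R.
    mode = min(candidates, key=lambda m: candidates[m][1] + lam * candidates[m][0])
    return ("lossy", mode)


# Hypothetical (bits, distortion) values for three candidate modes.
candidates = {"a": (300, 0.0), "b": (120, 40.0), "c": (200, 5.0)}
skip_choice = choose_encoding_path(1, True, candidates)
lossless_choice = choose_encoding_path(9, True, candidates)
lossy_choice = choose_encoding_path(9, False, candidates, lam=0.1)
```

Note that the skip check runs first, so blocks that match their reference never reach either cost comparison.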

FIG. 10 illustrates an embodiment of a network unit 1000, which may comprise an encoder and decoder that process video frames as described above, for example, within a network or system. The network unit 1000 may comprise a plurality of ingress ports 1010 and/or receiver units (Rx) 1012 for receiving data from other network units or components, a logic unit or processor 1020 to process data and determine which network unit to send the data to, and a plurality of egress ports 1030 and/or transmitter units (Tx) 1032 for transmitting data to the other network units. The logic unit or processor 1020 may be configured to implement any of the schemes described herein, such as the transform bypass encoding scheme 100, the transform without quantization encoding scheme 300, at least one of the encoding method 700 and the decoding method 800, and/or the encoding mode selection method 900. The logic unit 1020 may be implemented using hardware, software, or both.

The schemes described above may be implemented on any general-purpose network component, such as a computer or network component with sufficient processing power, memory resources, and network throughput capability to handle the necessary workload placed upon it. FIG. 11 illustrates a schematic diagram of a typical, general-purpose network component or computer system 1100 suitable for implementing one or more embodiments of the methods disclosed herein, such as the encoding method 700 and the decoding method 800. The general-purpose network component or computer system 1100 includes a processor 1102 (which may be referred to as a central processor unit or CPU) that is in communication with memory devices including secondary storage 1104, read only memory (ROM) 1106, random access memory (RAM) 1108, input/output (I/O) devices 1110, and network connectivity devices 1112. Although illustrated as a single processor, the processor 1102 is not so limited and may comprise multiple processors. The processor 1102 may be implemented as one or more CPU chips, cores (e.g., a multi-core processor), field-programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), and/or digital signal processors (DSPs), and/or may be part of one or more ASICs. The processor 1102 may be configured to implement any of the schemes described herein, including the transform bypass encoding scheme 100, the transform without quantization encoding scheme 300, at least one of the encoding method 700 and the decoding method 800, and/or the encoding mode selection method 900. The processor 1102 may be implemented using hardware, software, or both.

The secondary storage 1104 is typically comprised of one or more disk drives or tape drives and is used for non-volatile storage of data and as an over-flow data storage device if the RAM 1108 is not large enough to hold all working data. The secondary storage 1104 may be used to store programs that are loaded into the RAM 1108 when such programs are selected for execution. The ROM 1106 is used to store instructions and perhaps data that are read during program execution. The ROM 1106 is a non-volatile memory device that typically has a small memory capacity relative to the larger memory capacity of the secondary storage 1104. The RAM 1108 is used to store volatile data and perhaps to store instructions. Access to both the ROM 1106 and the RAM 1108 is typically faster than to the secondary storage 1104.

At least one embodiment is disclosed and variations, combinations, and/or modifications of the embodiment(s) and/or features of the embodiment(s) made by a person having ordinary skill in the art are within the scope of the disclosure. Alternative embodiments that result from combining, integrating, and/or omitting features of the embodiment(s) are also within the scope of the disclosure. Where numerical ranges or limitations are expressly stated, such express ranges or limitations should be understood to include iterative ranges or limitations of like magnitude falling within the expressly stated ranges or limitations (e.g., from about 1 to about 10 includes 2, 3, 4, etc.; greater than 0.10 includes 0.11, 0.12, 0.13, etc.). For example, whenever a numerical range with a lower limit, R_l, and an upper limit, R_u, is disclosed, any number falling within the range is specifically disclosed. In particular, the following numbers within the range are specifically disclosed: R=R_l+k*(R_u-R_l), wherein k is a variable ranging from 1 percent to 100 percent with a 1 percent increment, i.e., k is 1 percent, 2 percent, 3 percent, 4 percent, 7 percent, . . . , 70 percent, 71 percent, 72 percent, . . . , 95 percent, 96 percent, 97 percent, 98 percent, 99 percent, or 100 percent. Moreover, any numerical range defined by two R numbers as defined in the above is also specifically disclosed. The use of the term about means ±10% of the subsequent number, unless otherwise stated. Use of the term “optionally” with respect to any element of a claim means that the element is required, or alternatively, the element is not required, both alternatives being within the scope of the claim. Use of broader terms such as comprises, includes, and having should be understood to provide support for narrower terms such as consisting of, consisting essentially of, and comprised substantially of.
Accordingly, the scope of protection is not limited by the description set out above but is defined by the claims that follow, that scope including all equivalents of the subject matter of the claims. Each and every claim is incorporated as further disclosure into the specification and the claims are embodiment(s) of the present disclosure. The discussion of a reference in the disclosure is not an admission that it is prior art, especially any reference that has a publication date after the priority date of this application. The disclosure of all patents, patent applications, and publications cited in the disclosure are hereby incorporated by reference, to the extent that they provide exemplary, procedural, or other details supplementary to the disclosure.

While several embodiments have been provided in the present disclosure, it may be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system or certain features may be omitted, or not implemented.

In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and may be made without departing from the spirit and scope disclosed herein.

1. An apparatus comprising: a processor configured to: receive a current block of a video frame; and determine a coding mode for the current block based on only a bit rate cost function, wherein the coding mode is selected from a plurality of available coding modes, and wherein calculation of the bit rate cost function does not consider distortion of the current block.
2. The apparatus of claim 1, wherein the coding mode results in a least number of bits needed to encode the current block compared with all other coding modes in the plurality of available coding modes.
3. The apparatus of claim 2, wherein the current block is encoded using a transform bypass encoding scheme, wherein a transform step and a quantization step are bypassed in the transform bypass encoding scheme.
4. The apparatus of claim 2, wherein the current block is encoded using a transform without quantization encoding scheme, wherein a quantization step is bypassed in the transform without quantization encoding scheme.
5. A method comprising: receiving a current block of a video frame; and determining a coding mode for the current block based on only a bit rate cost function, wherein the coding mode is selected from a plurality of available coding modes, and wherein calculation of the bit rate cost function does not consider distortion of the current block.
6. The method of claim 5, wherein the coding mode results in a least number of bits needed to encode the current block compared with all other coding modes in the plurality of available coding modes.
7. The method of claim 6, wherein the current block is encoded using a transform bypass encoding scheme, wherein a transform step and a quantization step are bypassed in the transform bypass encoding scheme.
8. The method of claim 6, wherein the current block is encoded using a transform without quantization encoding scheme, wherein a quantization step is bypassed in the transform without quantization encoding scheme.
9. An apparatus used in video coding comprising: a processor configured to: for each of a plurality of pixels in a block, determine a difference with one of a plurality of corresponding pixels in a reference block, wherein each difference is based on two color values of a pair of compared pixels; and if each of the differences is within a pre-set boundary, generate information to signal the block as a skipped block, wherein the information identifies the block and the reference block, and include the information into a bitstream without further encoding of the block.
10. The apparatus of claim 9, wherein the block is a coding unit (CU), and wherein the reference block is a reference CU.
11. The apparatus of claim 10, wherein the information comprises: a plurality of coordinates of the CU; and a plurality of coordinates of the reference CU.
12. The apparatus of claim 10, wherein the pre-set boundary is ±1.
13. The apparatus of claim 10, wherein the pre-set boundary is 0.
14. The apparatus of claim 12, wherein the block is located at a first position in a video frame, wherein the reference block is located at a second position in a reference video frame, wherein the first position and second position are equal in coordinates, wherein each pair of compared pixels are located at a same position in the block and the reference block, wherein the video frame or the reference video frame is a predicted frame (P-frame), an intra-coded frame (I-frame), or a bi-directionally predicted frame (B-frame).
15. The apparatus of claim 12, wherein the block is located at a first position in a video slice, wherein the reference block is located at a second position in a reference video slice, wherein the first position and second position are equal in coordinates, wherein each pair of compared pixels are located at a same position in the block and the reference block, wherein the video slice or the reference video slice is a predicted slice (P-slice), an intra-coded slice (I-slice), or a bi-directionally predicted slice (B-slice).
16. The apparatus of claim 12, wherein the processor is further configured to: if any of the differences exceeds the pre-set boundary, determine a coding mode for the block based on only a bit rate cost function, wherein the coding mode is selected from a plurality of available coding modes, wherein calculation of the bit rate cost function does not consider distortion of the block, and wherein the coding mode results in a least number of bits needed to encode the current block compared with all other coding modes in the plurality of available coding modes.
17. A method used in video coding comprising: for each of a plurality of pixels in a block, determining a difference with one of a plurality of corresponding pixels in a reference block, wherein each difference is based on two color values of a pair of compared pixels; and if each of the differences is within a pre-set boundary, generating information to signal the block as a skipped block, wherein the information identifies the block and the reference block, and including the information into a bitstream without further encoding of the block.
18. The method of claim 17, wherein the block is a coding unit (CU), and wherein the reference block is a reference CU.
19. The method of claim 18, wherein the information comprises: a plurality of coordinates of the CU; and a plurality of coordinates of the reference CU.
20. The method of claim 18, wherein the pre-set boundary is ±1.
21. The method of claim 18, wherein the pre-set boundary is 0.
22. The method of claim 20, wherein the block is located at a first position in a video frame, wherein the reference block is located at a second position in a reference video frame, wherein the first position and second position are equal in coordinates, wherein each pair of compared pixels are located at a same position in the block and the reference block, wherein the video frame or the reference video frame is a predicted frame (P-frame), an intra-coded frame (I-frame), or a bi-directionally predicted frame (B-frame).
23. The method of claim 20, wherein the block is located at a first position in a video slice, wherein the reference block is located at a second position in a reference video slice, wherein the first position and second position are equal in coordinates, wherein each pair of compared pixels are located at a same position in the block and the reference block, wherein the video slice or the reference video slice is a predicted slice (P-slice), an intra-coded slice (I-slice), or a bi-directionally predicted slice (B-slice).
24. The method of claim 20, further comprising: if any of the differences exceeds the pre-set boundary, determining a coding mode for the block based on only a bit rate cost function, wherein the coding mode is selected from a plurality of available coding modes, wherein calculation of the bit rate cost function does not consider distortion of the block, and wherein the coding mode results in a least number of bits needed to encode the current block compared with all other coding modes in the plurality of available coding modes.
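
By way of illustration only, and without limiting the claims, the skip determination recited in claims 9 and 17 and the rate-only mode decision recited in claims 1, 5, 16, and 24 might be sketched in Python as follows. All function names, the dictionary layout of the returned information, and the two toy rate estimators are hypothetical and are not taken from the disclosure. The estimators are simple stand-ins for whatever bit counting an actual encoder would perform, and a block is assumed to be a two-dimensional list holding one color value per pixel.

```python
from typing import Callable, Dict, Sequence, Tuple

# Assumption: a block is a 2-D sequence holding one color value per pixel.
Block = Sequence[Sequence[int]]


def is_skippable(block: Block, ref_block: Block, boundary: int = 0) -> bool:
    """Return True if every co-located pixel difference lies within +/-boundary."""
    return all(
        abs(cur - ref) <= boundary
        for cur_row, ref_row in zip(block, ref_block)
        for cur, ref in zip(cur_row, ref_row)
    )


def toy_bits_raw(block: Block) -> int:
    # Hypothetical estimator: one sign bit plus the magnitude's bit length per sample.
    return sum(1 + abs(v).bit_length() for row in block for v in row)


def toy_bits_horizontal_dpcm(block: Block) -> int:
    # Hypothetical estimator: each sample is coded as a difference from its left neighbor.
    total = 0
    for row in block:
        prev = 0
        for v in row:
            total += 1 + abs(v - prev).bit_length()
            prev = v
    return total


def select_mode_by_rate(
    block: Block, rate_estimators: Dict[str, Callable[[Block], int]]
) -> Tuple[str, int]:
    """Pick the mode with the fewest estimated bits; distortion is never computed."""
    costs = {mode: estimate(block) for mode, estimate in rate_estimators.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


def encode_decision(block, ref_block, block_pos, ref_pos, rate_estimators, boundary=0):
    """Signal a skipped block if possible; otherwise choose a mode by bit rate only."""
    if is_skippable(block, ref_block, boundary):
        # The information identifies the block and the reference block by coordinates.
        return {"type": "skip", "block": block_pos, "reference": ref_pos}
    mode, bits = select_mode_by_rate(block, rate_estimators)
    return {"type": "coded", "mode": mode, "bits": bits}
```

In this sketch, a call such as `encode_decision(block, ref_block, (0, 0), (0, 0), {"raw": toy_bits_raw, "horizontal_dpcm": toy_bits_horizontal_dpcm}, boundary=1)` returns skip information when all differences are within ±1, and otherwise returns the mode whose estimator reports the smallest bit count. Because no distortion term is evaluated, the comparison reduces to a single minimum over the estimated bit counts.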